Classification of English language learner writing errors using a parallel corpus with SVM

Authors

Brendan Flanagan

Chengjiu Yin

Takahiko Suzuki

Sachio Hirokawa

Abstract

To overcome mistakes, learners need feedback that prompts reflection on their errors. This is particularly important in educational systems, because how effectively a system finds errors can affect learning. Finding errors is essential to providing appropriate guidance so that learners can overcome their flaws. Traditionally, finding errors in writing takes time and effort. The authors of this paper have a long-term research goal of creating tools that make learners, especially autonomous learners, more aware of their errors and give them a way to reflect on those errors. As part of this research, we propose using a classifier to automatically analyse and determine the errors in foreign language writing. For the experiment in this paper, we collected random sentences from the Lang-8 website that had been written by foreign language learners. Using predefined error categories, we manually classified the sentences for use as machine learning training data. We then trained a classifier by applying SVM machine learning to this training data. Because manually classifying training data takes time, the classifier is intended to accelerate the generation of further training data.
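The pipeline described in the abstract (labelled sentence pairs, then an SVM) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and not the authors' method: the `del:`/`add:` word-difference features, the toy sentence pairs, the single "article error" category treated as a binary task, and the Pegasos-style subgradient trainer, which stands in for whatever SVM implementation the paper used.

```python
from collections import Counter


def tokenize(sentence):
    # Naive whitespace tokenizer; a real system would use a proper one.
    return sentence.lower().split()


def diff_features(original, corrected):
    """Hypothetical features from a learner sentence and its correction.

    Words that appear only in the original get a 'del:' prefix,
    words that appear only in the correction get an 'add:' prefix.
    """
    orig = Counter(tokenize(original))
    corr = Counter(tokenize(corrected))
    feats = Counter()
    for word, count in (orig - corr).items():
        feats["del:" + word] += count
    for word, count in (corr - orig).items():
        feats["add:" + word] += count
    return feats


def train_linear_svm(examples, labels, epochs=50, lam=0.01):
    """Linear SVM (hinge loss, L2 penalty) via Pegasos-style updates.

    Labels must be +1 or -1. No bias term is learned in this sketch.
    """
    w = {}
    t = 0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y * sum(w.get(f, 0.0) * v for f, v in x.items())
            scale = 1.0 - eta * lam  # shrink step from the L2 penalty
            for f in w:
                w[f] *= scale
            if margin < 1:  # hinge loss is active for this example
                for f, v in x.items():
                    w[f] = w.get(f, 0.0) + eta * y * v
    return w


def predict(w, x):
    score = sum(w.get(f, 0.0) * v for f, v in x.items())
    return 1 if score > 0 else -1


# Toy, hand-made training pairs (not from Lang-8).
# Label +1 = article error, -1 = some other error category.
pairs = [
    ("I bought book yesterday", "I bought a book yesterday", 1),
    ("She is teacher", "She is a teacher", 1),
    ("He go to school", "He goes to school", -1),
    ("I have went there", "I have gone there", -1),
]
examples = [diff_features(o, c) for o, c, _ in pairs]
labels = [y for _, _, y in pairs]
weights = train_linear_svm(examples, labels)

print(predict(weights, diff_features("We saw movie", "We saw a movie")))
print(predict(weights, diff_features("He go home", "He goes home")))
```

With several error categories, one such binary classifier per category (one-vs-rest) would be a common arrangement. Its predictions could then pre-sort new sentences for a human annotator, which is the kind of speed-up in generating training data that the abstract mentions.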

Similar resources

Hedges in English for Academic Purposes: A Corpus-based study of Iranian EFL learners

Hedges, as tools to express tentativeness and doubt, have been studied in plenty of research papers in the Iranian EFL research setting. However, their use in a learner corpus, portraying Iranian learner English, is in need of more research attention. With this end in view, this study aimed at investigating how Iranian EFL learners who have majored in English-related fields in Iran deployed hed...

How textbooks (and learners) get it wrong: A corpus study of modal auxiliary verbs

Many elements contribute to the relative difficulty in acquiring specific aspects of English as a foreign language (Goldschneider & DeKeyser, 2001). Modal auxiliary verbs (e.g. could, might) are examples of a structure that is difficult for many learners. Not only are they particularly complex semantically, but especially in the Malaysian context ...

Metadiscourse Markers in a Corpus of Learner Language: The Case of Iranian EFL Learners

Different issues have been probed in learner corpus research since the late 1980s. However, taking the importance of metadiscourse markers (MDMs) in signposting academic discourse, their use in Iranian EFL learners' academic essays is an area of research in need of a more serious analysis. Contributing to this line of investigation, this paper reports a corpus-based study of the use of MDMs i...

Error Analysis of Taiwanese University Students’ English Essay Writing: A Longitudinal Corpus Study

Writing is considered one of the most difficult skills in EFL/ESL. Thus, meticulous recognition and classification of students’ errors in certain contexts is a worthwhile endeavor which provides us with both diagnostic and prognostic power. Accordingly, a total of 430 students in 15 English writing classes held during 12 consecutive semesters in a private university in central Taiwan were the s...

Learning with Learner Corpora: using the TLE for Native Language Identification

This study investigates the usefulness of the Treebank of Learner English (TLE) when applied to the task of Native Language Identification (NLI). The TLE is effectively a parallel corpus of Standard/Learner English, as there are two versions: one based on original learner essays, and the other an error-corrected version. We use the corpus to explore how useful a parser trained on ungrammatical ...

Journal:

I. J. Knowledge and Web Intelligence

Volume 5

Publication date: 2014
